Reinforcement Learning: A Tutorial Survey and Recent Advances

نویسنده

  • Abhijit Gosavi
چکیده

In the last few years, Reinforcement Learning (RL), also called adaptive (or approximate) dynamic programming (ADP), has emerged as a powerful tool for solving complex sequential decision-making problems in control theory. Although seminal research in this area was performed in the artificial intelligence (AI) community, more recently, it has attracted the attention of optimization theorists because of several noteworthy success stories from operations management. It is on large-scale and complex problems of dynamic optimization, in particular the Markov decision problem (MDP) and its variants, that the power of RL becomes more obvious. It has been known for many years that on large-scale MDPs, the curse of dimensionality and the curse of modeling render classical dynamic programming (DP) ineffective. The excitement in RL stems from its direct attack on these curses, allowing it to solve problems that were considered intractable, via classical DP, in the past. The success of RL is due to its strong mathematical roots in the principles of DP, Monte Carlo simulation, function approximation, and AI. Topics treated in some detail in this survey are: Temporal differences, Q-Learning, semi-MDPs and stochastic games. Several recent advances in RL, e.g., policy gradients and hierarchical RL, are covered along with references. Pointers to numerous examples of applications are provided. This overview is aimed at uncovering the mathematical roots of this science, so that readers gain a clear understanding of the core concepts and are able to use them in their own research. The survey points to more than 100 references from the literature.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Reinforcement Learning in Neural Networks: A Survey

In recent years, researches on reinforcement learning (RL) have focused on bridging the gap between adaptive optimal control and bio-inspired learning techniques. Neural network reinforcement learning (NNRL) is among the most popular algorithms in the RL framework. The advantage of using neural networks enables the RL to search for optimal policies more efficiently in several real-life applicat...

متن کامل

Reinforcement Learning in Neural Networks: A Survey

In recent years, researches on reinforcement learning (RL) have focused on bridging the gap between adaptive optimal control and bio-inspired learning techniques. Neural network reinforcement learning (NNRL) is among the most popular algorithms in the RL framework. The advantage of using neural networks enables the RL to search for optimal policies more efficiently in several real-life applicat...

متن کامل

A Tutorial Survey of Reinforcement Learn- Ing

This paper gives a compact, self{contained tutorial survey of reinforcement learning, a tool that is increasingly nding application in the development of intelligent dynamic systems. Research on reinforcement learning during the past decade has led to the development of a variety of useful algorithms. This paper surveys the literature and presents the algorithms in a cohesive framework.

متن کامل

Tutorial: Recent Advances in Deep Learning

The past several years have seen a dramatic acceleration in artificial intelligence (AI) research, driven in large part by innovations in deep learning and reinforcement learning (RL) methods. The relevant developments, as showcased in a series of recent high-profile publications in Nature and elsewhere (e.g., Graves et al., 2016; Mnih et al., 2015; Silver et al., 2016), have generated intense ...

متن کامل

Optimizing Tutorial Planning in Educational Games: A Modular Reinforcement Learning Approach

Recent years have seen a growing interest in educational games, which integrate the engaging features of digital games with the personalized learning functionalities of intelligent tutoring systems. A key challenge in creating educational games, particularly those supported with interactive narrative, is devising narrativecentered tutorial planners, which dynamically adapt gameplay events to in...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:
  • INFORMS Journal on Computing

دوره 21  شماره 

صفحات  -

تاریخ انتشار 2009